Abstract

We consider the dynamics of diluted neural networks with clipped and adapting 
synapses. Unlike previous studies, the learning rate is kept constant as the connectivity 
tends to infinity: the synapses evolve on a time scale intermediate between the quenched 
and annealing limits and all orders of synaptic correlations must be taken into account.

The dynamics is solved by mean-field theory, the order parameter for synapses being a 
function. We describe the effects, in the double dynamics, due to synaptic correlations. 



PACS numbers: 87.10.+e, 05.20.-y 

In the past years, many models with a coupled dynamics of fast Ising spins and slow
interactions have been studied to understand the simultaneous learning and retrieval in
recurrent neural networks [1,2]. A major approach to this problem is replica mean-field
theory with the replica number being the ratio of two temperatures characterizing the
stochasticity in the spin dynamics and the interaction dynamics, respectively [3,4].
Recently this approach has been used to study coupled dynamics in the XY spin glass [5,6];
the generalization of these ideas [7] to the case of a hierarchy of subsystems with
different characteristic time-scales, in the Sherrington-Kirkpatrick model, interestingly
leads to Parisi's solution [8]. Other approaches to coupled dynamics in neural networks are
described in [9], using a Discrete Time Master Equation approach, and in [10], exploring
temporal learning rules. Stochastic learning rules in diluted neural networks were
considered in [11]: it was shown that in order to preserve the associative memory
capability of the network the learning rate $q$ must be kept very small (e.g.,
$q = O(1/K)$, where $K$ is the connectivity). Moreover, in [11] the choice of a very small
learning rate implied that the correlation between synaptic variables could be neglected,
so that the dynamics was solved by flow equations for a small number of macroscopic order
parameters. It is the purpose of this work to reconsider coupled dynamics in diluted neural
networks and keep the learning rate fixed as the connectivity $K$ tends to $\infty$. The
dynamics of the network, in this limit, can be exactly solved by taking into account all
the orders of correlations between synapses, the order parameter for synapses being a
function on the interval $[-1,1]$. According to the argument in [11], the functioning of this
model as an associative memory is questionable; we regard it as a simple model to analyze 
the effects due to synaptic correlations in the double dynamics. 



As in [11] we consider a diluted neural network with uni-directional synapses obeying a
stochastic learning mechanism [12]. The model is made of $N$ three-state neurons
$s_i = 0, \pm 1$, each connected (by binary synapses $J_{ij} = \pm 1$) to $K$ input sites,
chosen at random among the $N$ sites. The parallel rule for updating synapses is the
following: with probability $q$ each synapse $J_{ij}$ assumes the value $s_i s_j$ if this
product is not zero; otherwise the synapse remains unchanged. A parallel stochastic
dynamics with inverse temperature $\beta$ is assumed for neurons, where the local field
acting on neuron $s_i$ is given by $h_i = (\sum_j J_{ij} s_j)/K$, the sum being over the
input neurons. The coupled dynamics consists in alternate updating of neurons and synapses.
We will consider the limit $N, K \to \infty$ with $K \ll \ln N$: it is well known [13] that
neurons can then be treated as i.i.d. stochastic variables. Moreover we choose $q$ constant
as $K \to \infty$: $q$ controls the ratio between the time scales over which neurons and
synapses evolve, and the adiabatic approximation is recovered by sending $q$ to zero. As a
consequence, in the present case one cannot neglect the correlations among synapses.
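
The synaptic learning rule and the local field described above can be sketched for a single neuron and its $K$ input synapses. This is only an illustrative sketch: the function names `local_field` and `update_synapses` are hypothetical, and the inputs are drawn uniformly here purely for demonstration.

```python
import random

def local_field(J, inputs):
    """Local field h = (sum_j J_j * s_j) / K for one neuron with K input synapses."""
    return sum(Jj * sj for Jj, sj in zip(J, inputs)) / len(J)

def update_synapses(J, s0, inputs, q, rng):
    """One parallel synaptic update for the K synapses feeding neuron s0.

    With probability q each synapse takes the value s0 * s_j, provided this
    product is not zero; otherwise the synapse is left unchanged.
    """
    new_J = []
    for Jj, sj in zip(J, inputs):
        product = s0 * sj
        if product != 0 and rng.random() < q:
            new_J.append(product)
        else:
            new_J.append(Jj)
    return new_J

rng = random.Random(0)
K = 8
J = [rng.choice((-1, 1)) for _ in range(K)]          # binary synapses
inputs = [rng.choice((-1, 0, 1)) for _ in range(K)]  # three-state input neurons
print("h =", local_field(J, inputs))
print("J after one update with s0 = +1:", update_synapses(J, 1, inputs, 0.5, rng))
```

Note that a silent target neuron ($s_0 = 0$) or a silent input leaves the corresponding synapse untouched, which is what makes the effective update probability depend on the neuronic activity.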

Let us denote by $s_1, s_2, \ldots, s_K$ the input neurons and by $J_1, J_2, \ldots, J_K$
the set of $K$ input synapses for a given neuron $s_0$ (due to the translational symmetry
the following reasoning holds for an arbitrary $s_0$).

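We start by considering the following simple situation: the synapses are independently
updated by the transition matrix $T(\mathbf{J}|\mathbf{J}') = \prod_{\alpha=1}^{K} \tau(J_\alpha|J'_\alpha)$,
where the transition matrix for the single synapse is the following:

$$\tau = \begin{pmatrix} 1-A & B \\ A & 1-B \end{pmatrix},$$

with columns labelled by the old value $J'_\alpha = +1, -1$ and rows by the new value
$J_\alpha = +1, -1$, so that $A$ is the probability of the flip $+1 \to -1$ and $B$ that of
the flip $-1 \to +1$.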

A good order parameter for synapses is $x = (\sum_{\alpha=1}^{K} J_\alpha)/K \in [-1,1]$.
Indeed, denoting with $p_t(x)$ the pdf for $x$ at time $t$, one can demonstrate (see the
Appendix) that in the large $K$ limit the evolution of $x$ is ruled by a deterministic
Liouville operator:

$$p_{t+1}(x) = \int dy\, \delta\big(x - \tilde{x}(y)\big)\, p_t(y) \tag{1}$$

with $\tilde{x}(y) = B - A + y(1 - A - B)$. The moments of $p_t$ provide the synaptic correlations:

$$\langle x^p \rangle_t = \int dx\, x^p\, p_t(x) = \langle J^{(1)} J^{(2)} \cdots J^{(p)} \rangle_t, \tag{2}$$



where the synapses $J^{(1)}, J^{(2)}, \ldots, J^{(p)}$ are all different. The probability
distribution, at time $t$, for the local field acting on neuron $s_0$ is

$$P_t(h) = \frac{1}{m_t}\, p_t\!\left(\frac{h}{m_t}\right), \qquad h \in [-m_t, m_t], \tag{3}$$

where $m_t$ is $\langle s \rangle_t$, the average magnetization of the neuronic configuration.
We will denote by $Q_t = \langle s^2 \rangle_t$ the activity of neurons, satisfying
$Q_t \geq m_t$ for every time $t$.
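
The concentration of $x$ onto a deterministic map at large $K$ can be illustrated numerically. The sketch below assumes, as the relation $x = B - A + y(1 - A - B)$ implies, that $A$ is the probability of a $+1 \to -1$ flip and $B$ that of a $-1 \to +1$ flip; the function names are hypothetical and the parameter values arbitrary.

```python
import random

def liouville_map(y, A, B):
    """Deterministic large-K image of the synaptic average y after one update."""
    return B - A + y * (1.0 - A - B)

def simulate_update(K, y, A, B, rng):
    """Update K independent binary synapses once; return the new average x."""
    n_plus = round(K * (1.0 + y) / 2.0)  # number of synapses currently equal to +1
    total = 0
    for a in range(K):
        if a < n_plus:
            # a +1 synapse flips to -1 with probability A
            total += -1 if rng.random() < A else 1
        else:
            # a -1 synapse flips to +1 with probability B
            total += 1 if rng.random() < B else -1
    return total / K

rng = random.Random(0)
y, A, B = 0.2, 0.01, 0.04
predicted = liouville_map(y, A, B)
for K in (100, 10_000, 200_000):
    x = simulate_update(K, y, A, B, rng)
    print(K, x, abs(x - predicted))
```

The deviation from the predicted value is expected to shrink roughly as $1/\sqrt{K}$, which is the sense in which the kernel becomes a $\delta$-function.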

Let us now come back to our problem. Due to the synaptic learning rules, the values of
$A$ and $B$ now depend on the value of $s_0$. If $s_0 = 0$ then $A = B = 0$ and $x = y$.
If $s_0 = 1$ then $A = q\left(\frac{Q_t - m_t}{2}\right)$, $B = q\left(\frac{Q_t + m_t}{2}\right)$
and $x = q m_t + y(1 - qQ_t)$. If $s_0 = -1$ then $A = q\left(\frac{Q_t + m_t}{2}\right)$,
$B = q\left(\frac{Q_t - m_t}{2}\right)$ and $x = -q m_t + y(1 - qQ_t)$. This implies that
even if at time $t$ we know $x$ exactly (i.e., $p_t$ is a $\delta$-function), at time
$t+1$ the value of $x$ is not determined ($p_{t+1}$ will generically be a convex sum of
three $\delta$'s). The full distribution $p$ now plays the role of order parameter for
the synaptic variables, the time evolution law being given by a mixture of three Liouville
operators:

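$$p_{t+1}(x) = (1 - Q_t)\, p_t(x) + \frac{Q_t + m_t}{2(1 - qQ_t)}\, \theta\!\left(1 - \left|\frac{x - q m_t}{1 - qQ_t}\right|\right) p_t\!\left(\frac{x - q m_t}{1 - qQ_t}\right) + \frac{Q_t - m_t}{2(1 - qQ_t)}\, \theta\!\left(1 - \left|\frac{x + q m_t}{1 - qQ_t}\right|\right) p_t\!\left(\frac{x + q m_t}{1 - qQ_t}\right), \tag{4}$$

where $\theta$ is Heaviside's function.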
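
The mixture of three Liouville operators has a simple sampling interpretation: a sample of $x$ is left unchanged with probability $1 - Q_t$, and is sent to $\pm q m_t + (1 - qQ_t)x$ with probabilities $(Q_t \pm m_t)/2$. The following sketch, with a hypothetical helper `step_samples` and arbitrary parameter values, applies one step to a $\delta$-distributed population and lists the resulting support.

```python
import random

def step_samples(xs, m, Q, q, rng):
    """Apply one synaptic update to samples of x at fixed neuronic m and Q."""
    shrink = 1.0 - q * Q
    out = []
    for x in xs:
        u = rng.random()
        if u < 1.0 - Q:
            # s_0 = 0: the synapses are not updated
            out.append(x)
        elif u < 1.0 - Q + (Q + m) / 2.0:
            # s_0 = +1
            out.append(q * m + shrink * x)
        else:
            # s_0 = -1
            out.append(-q * m + shrink * x)
    return out

rng = random.Random(0)
m, Q, q, x0 = 0.5, 0.8, 0.06, 0.3
xs = step_samples([x0] * 10_000, m, Q, q, rng)  # p_t is a delta at x0
support = sorted(set(round(x, 12) for x in xs))
print(support)  # three distinct values: a convex sum of three deltas
```

Iterating `step_samples` many times at fixed `m` and `Q` gives samples whose histogram approximates the invariant distribution of the map.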

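Let us now consider the dynamics of neurons. We assume the following form for the
conditional probability for neurons:

$$P(s_{t+1}|h) \propto \exp\left[\beta\left(h s_{t+1} + a s_{t+1}^2\right)\right], \tag{5}$$

where $h$ is the local field produced by $\mathbf{s}_t$, the vector of neurons at time $t$,
and $a$ controls the mean activity of the network. The time evolution law for neuronic
order parameters is then given by

$$m_{t+1} = \int_{-1}^{1} dx\, p_t(x)\, \frac{2\sinh(\beta x m_t)}{e^{-\beta a} + 2\cosh(\beta x m_t)}, \tag{6}$$

$$Q_{t+1} = \int_{-1}^{1} dx\, p_t(x)\, \frac{2\cosh(\beta x m_t)}{e^{-\beta a} + 2\cosh(\beta x m_t)}. \tag{7}$$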



These two equations, together with (4) and the initial conditions $m_0$, $Q_0$ and $p_0(x)$,
solve the double dynamics for the present model.
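
A minimal sketch of how the double dynamics could be iterated numerically is given below. It represents $p_t$ by a finite population of samples (an assumption made here for simplicity, not the method used for the figures), computes the neuronic averages as thermal averages of $s$ and $s^2$ under (5) with local field $h = x m_t$, and then updates the samples with the three-branch synaptic map. All names and parameter values are illustrative.

```python
import math
import random

def neuron_averages(h, beta, a):
    """Thermal averages <s> and <s^2> for P(s) ~ exp(beta*(h*s + a*s*s)), s in {-1, 0, 1}."""
    w_plus = math.exp(beta * (h + a))
    w_minus = math.exp(beta * (a - h))
    z = 1.0 + w_plus + w_minus
    return (w_plus - w_minus) / z, (w_plus + w_minus) / z

def coupled_step(xs, m, Q, q, beta, a, rng):
    """One step of the double dynamics on a sample representation of p_t."""
    # Neuronic order parameters at t+1: average over x ~ p_t with h = x * m_t.
    m_new = 0.0
    Q_new = 0.0
    for x in xs:
        s1, s2 = neuron_averages(x * m, beta, a)
        m_new += s1
        Q_new += s2
    m_new /= len(xs)
    Q_new /= len(xs)
    # Synaptic samples at t+1: mixture of three affine maps built from m_t, Q_t.
    shrink = 1.0 - q * Q
    xs_new = []
    for x in xs:
        u = rng.random()
        if u < 1.0 - Q:
            xs_new.append(x)
        elif u < 1.0 - Q + (Q + m) / 2.0:
            xs_new.append(q * m + shrink * x)
        else:
            xs_new.append(-q * m + shrink * x)
    return xs_new, m_new, Q_new

rng = random.Random(0)
beta, a, q = 4.0, 0.5, 0.05      # arbitrary illustrative parameters
m, Q = 0.6, 0.8                  # initial conditions m_0, Q_0
xs = [0.5] * 2000                # p_0 is a delta at x = 0.5
for t in range(200):
    xs, m, Q = coupled_step(xs, m, Q, q, beta, a, rng)
print(m, Q, sum(xs) / len(xs))
```

The sample representation trades the exact functional iteration for statistical noise of order one over the square root of the population size; a grid-based discretization of $p_t$ would be an alternative.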

Now we turn to analyze the flow equations. Firstly we consider the case of $m$ and $Q$
being kept constant: $p_t$ tends asymptotically to the invariant distribution $p_\infty$ of (4).
One can easily derive a recurrence formula for the moments of the stationary distribution:

$$\langle x^n \rangle_\infty = {\sum_k}^{(e)} \binom{n}{k} (1 - qQ)^{n-k} (qm)^k \langle x^{n-k} \rangle_\infty + \frac{m}{Q} {\sum_k}^{(o)} \binom{n}{k} (1 - qQ)^{n-k} (qm)^k \langle x^{n-k} \rangle_\infty,$$

where $\sum^{(e)}$ ($\sum^{(o)}$) is over even (odd) integers $k$ with $0 \le k \le n$; moving
the $k = 0$ term to the left-hand side makes the recurrence explicit. The invariant
distribution is a $\delta$-function in the following cases. If $m = 0$ then
$p_\infty = \delta(x)$. If $m = \pm 1$ then $p_\infty = \delta(x - 1)$, and in the adiabatic
limit $q \to 0$ we have $p_\infty \to \delta(x - m^2/Q^2)$. In the general case the first two
cumulants are given by:



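$$\langle x \rangle_c = \frac{m^2}{Q^2},$$

which is independent of $q$, and

$$\langle x^2 \rangle_c = \langle x^2 \rangle_\infty - \langle x \rangle_\infty^2 = \frac{qQ}{2 - qQ}\left(\frac{m^2}{Q^2} - \frac{m^4}{Q^4}\right). \tag{9}$$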

The last formula clearly shows how the synaptic correlations are controlled by the learning
rate $q$. For example, Figure 1 depicts the invariant distribution of (4), found
numerically for $q = 0.06$, $Q = 0.8$, and $m = 0.5$. We compare it with the distribution
of $x$ over time obtained by simulating a system of $K$ synapses evolving by the stochastic
learning mechanism, where neurons $s_0$ and $\{s_\alpha\}$ are independently sampled with
$\langle s \rangle = m$ and $\langle s^2 \rangle = Q$ at each time step. The agreement with
the theoretical curve improves as $K$ grows and is already fairly good for $K = 500$
(see Fig. 1).
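
As a cross-check on the stationary moments for the parameters of Figure 1, the recurrence can be evaluated in exact rational arithmetic. The sketch below takes moments of the three affine maps $x \to x$, $x \to \pm qm + (1 - qQ)x$ (weights $1 - Q$ and $(Q \pm m)/2$), with the $k = 0$ term moved to the left-hand side; the function name `stationary_moments` is hypothetical.

```python
from fractions import Fraction
from math import comb

def stationary_moments(n_max, m, Q, q):
    """Moments <x^n>, n = 0..n_max, of the invariant distribution at fixed m, Q, q."""
    r = 1 - q * Q
    moments = [Fraction(1)]
    for n in range(1, n_max + 1):
        rhs = Fraction(0)
        for k in range(1, n + 1):
            # even k carries weight 1, odd k carries weight m/Q
            weight = Fraction(1) if k % 2 == 0 else m / Q
            rhs += weight * comb(n, k) * r ** (n - k) * (q * m) ** k * moments[n - k]
        # the k = 0 term, (1 - qQ)^n <x^n>, has been moved to the left-hand side
        moments.append(rhs / (1 - r ** n))
    return moments

m, Q, q = Fraction(1, 2), Fraction(4, 5), Fraction(3, 50)  # m = 0.5, Q = 0.8, q = 0.06
mom = stationary_moments(4, m, Q, q)
mean = mom[1]
variance = mom[2] - mom[1] ** 2
print(float(mean), float(variance))
```

With exact fractions the first two cumulants can be compared directly with $m^2/Q^2$ and $\frac{qQ}{2-qQ}\left(\frac{m^2}{Q^2} - \frac{m^4}{Q^4}\right)$, without any rounding tolerance.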

The stationary regime of the coupled dynamics shows a paramagnetic phase with $m = 0$
and a ferromagnetic phase with $m \neq 0$ [15]. By numerical analysis we find the transition line
between the two phases in the $\beta$-$a$ plane: in Figure 2 our results are shown for some values
of q. At fixed a, the critical temperature decreases as q is increased: the synaptic correlations 
seem to amplify the disordering capability of thermal noise. The two phases are separated 
by a first order transition, in agreement with @ where the para-ferro transition changes from 
second to first order as the influence of spins on the couplings dynamics becomes dominant. 

Let us now study the role of adapting synapses in the damage spreading phenomenon 
(see, e.g., [pf] ). For simplicity we assume two state neurons s = ±1, and we work in the 
disordered phase $m = 0$. We assume the local fields to be:

$$h_i = \frac{1}{\sqrt{K}} \sum_j J_{ij} s_j + B_i, \tag{10}$$

where $B_i$ are random magnetic fields whose Gaussian distribution has variance $B$, and the
normalization has been chosen differently from the previous case so as to have a nontrivial
$K \to \infty$ limit in this case. We assume zero temperature and consider two replicas of
the system, subject to the same random fields and the same noise in the stochastic learning
mechanism. We introduce the order parameters $\Delta$ and $\epsilon$ defined as follows:
$\frac{1}{2}(1 + \Delta)$ is the probability that two corresponding synapses, in the two
replicas, are equal, while $\frac{1}{2}(1 + \epsilon)$ is the probability that two
corresponding neurons, in the two replicas, are equal. As in the previous section, one
easily finds that even if $\Delta$ is exactly known at a certain time, it is not determined
at later times: it must be described by a probability distribution $\Gamma_t(\Delta)$,
whose evolution is given by eq. (4) with $Q = 1$ and $m_t$ replaced by $\epsilon_t$. At fixed
$\Delta$, the variables $\{Js\}$ are equal, in the two replicas, with probability
$\frac{1}{2}(1 + \Delta\epsilon)$. Therefore the local fields in the two replicas can be
written $h_1 = X + Y$ and $h_2 = X - Y$, where $X$ and $Y$ are random Gaussian variables
with variance, respectively, $\sigma_X = (1 + \Delta\epsilon)/2 + B$ and
$\sigma_Y = (1 - \Delta\epsilon)/2$. One can then easily obtain the time evolution law for $\epsilon$:

$$\epsilon_{t+1} = 1 - \frac{4}{\pi} \int d\Delta\, \Gamma_t(\Delta)\, \tan^{-1} \sqrt{\frac{1 - \Delta\epsilon_t}{1 + \Delta\epsilon_t + 2B}}. \tag{11}$$

Studying damage spreading is equivalent to checking the stability of the trivial fixed point
$\epsilon = 1$ and $\Gamma = \delta(\Delta - 1)$, corresponding to two identical replicas. We
find that, for every finite $B$, damage spreading occurs and a nontrivial fixed point
$\epsilon^* < 1$ is stable. For low values of $q$ the stationary distribution $\Gamma$ is
peaked around its average $\epsilon^2$: approximating the $\tan^{-1}$ by Taylor expansion at
the second order around $\Delta = \epsilon^2$, the equation for the fixed point reads:

$$\epsilon^* = 1 - \frac{4}{\pi} \tan^{-1} \sqrt{\frac{1 - \epsilon^{*3}}{1 + \epsilon^{*3} + 2B}} + \frac{C}{\pi}\, \frac{\epsilon^{*2}\left(B + \epsilon^{*3}\right)}{\left[(1 - \epsilon^{*3})(1 + \epsilon^{*3} + 2B)\right]^{3/2}}, \tag{12}$$

where $C = \langle \Delta^2 \rangle - \langle \Delta \rangle^2 = q(\epsilon^{*2} - \epsilon^{*4})/(2 - q)$ at equilibrium.

The solution $\epsilon^*$ of the equation above is the asymptotic correlation between neurons
in the two replicas as a function of $q$. In Figure 3 we depict $d\epsilon^*/dq|_{q=0}$ versus
$B$. Since we find this quantity to be always positive, it follows that the synaptic
correlations act against the damage spreading phenomenon and tend to increase the
correlation between the configurations of neurons in the two replicas, as one might
intuitively expect.
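
The small-$q$ fixed point can be explored numerically without committing to a closed form for the second-order term. The sketch below is an illustration under stated assumptions: it averages $f(\Delta) = \tan^{-1}\sqrt{(1 - \Delta\epsilon)/(1 + \Delta\epsilon + 2B)}$ over $\Gamma$ through $f(\epsilon^2) + \frac{C}{2} f''(\epsilon^2)$, with $f''$ taken by central finite differences and $C = q(\epsilon^2 - \epsilon^4)/(2 - q)$, and locates the nontrivial root by bisection. The bracket $[0, 0.9]$, the step `h` and the function names are choices made here, not taken from the text.

```python
import math

def f(delta, eps, B):
    """tan^{-1} of sqrt(sigma_Y / sigma_X) at synaptic overlap delta and neuronic overlap eps."""
    u = delta * eps
    return math.atan(math.sqrt((1.0 - u) / (1.0 + u + 2.0 * B)))

def residual(eps, q, B, h=1e-4):
    """Right-hand side minus left-hand side of the second-order fixed-point equation."""
    d0 = eps * eps                                   # mean of Gamma
    C = q * (eps ** 2 - eps ** 4) / (2.0 - q)        # variance of Gamma
    f2 = (f(d0 + h, eps, B) - 2.0 * f(d0, eps, B) + f(d0 - h, eps, B)) / (h * h)
    return 1.0 - (4.0 / math.pi) * (f(d0, eps, B) + 0.5 * C * f2) - eps

def fixed_point(q, B, lo=0.0, hi=0.9, iters=60):
    """Bisection for the nontrivial root; assumes residual(lo) > 0 > residual(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(mid, q, B) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

B = 0.5
eps_0 = fixed_point(0.0, B)   # adiabatic limit q -> 0
eps_q = fixed_point(0.1, B)   # small but finite learning rate
print(eps_0, eps_q, (eps_q - eps_0) / 0.1)
```

The last printed number is a crude finite-difference estimate of the slope of $\epsilon^*$ in $q$ at one value of $B$; the second-order expansion is only meaningful away from $\epsilon = 1$, where the expansion coefficients grow without bound.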

We have described an exactly solvable model of double dynamics where synaptic
correlations, arising from a stochastic learning mechanism, are important at all orders.
The order parameter for synapses in the mean-field dynamical theory is a function whose
evolution is given by a mixture of Liouville operators. The critical temperature for the
ferromagnetic transition is found to decrease as the learning rate increases: there is a
wide range of temperatures such that the system may order or not depending on the speed at
which it adapts, and ordering is asymptotically achieved only if the adaptation is
sufficiently slow. We also outlined the role played by synaptic correlations in the damage
spreading phenomenon.

Appendix 

We show the validity of equation (1). Using the same notation as in the text, let
$P_t(\mathbf{J})$ be the pdf for synapses at time $t$. Then

$$P_{t+1}(\mathbf{J}) = \mathrm{Tr}_{\mathbf{J}'}\, T(\mathbf{J}|\mathbf{J}')\, P_t(\mathbf{J}'). \tag{13}$$

It is useful to observe that, due to the symmetry of our problem, the distribution
$P_t(\mathbf{J})$ will be symmetric under permutations of synapses (provided initial
conditions respect the symmetry). It follows that $P_t$ is a function of the only
nontrivial invariant for permutations one can build out of $K$ binary variables, i.e.
$x = \frac{1}{K}\sum_{\alpha=1}^{K} J_\alpha$.

After standard calculations [[[]], the probability distribution for x, pt{x), is found to evolve 
according to 

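$$p_{t+1}(x) = \int_{-1}^{1} dy\, W_K(x, y)\, p_t(y), \tag{14}$$

where the time-independent kernel $W_K$ is given by

$$W_K(x, y) = \frac{\mathrm{Tr}_{\mathbf{J}}\, \mathrm{Tr}_{\mathbf{J}'}\, \delta\!\left(y - \frac{1}{K}\sum_\alpha J'_\alpha\right) \delta\!\left(x - \frac{1}{K}\sum_\alpha J_\alpha\right) \prod_{\alpha=1}^{K} \tau(J_\alpha|J'_\alpha)}{\mathrm{Tr}_{\mathbf{J}'}\, \delta\!\left(y - \frac{1}{K}\sum_\alpha J'_\alpha\right)}. \tag{15}$$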



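The structure of this kernel is, in the limit $K \to \infty$:

$$W_K(x, y) \propto \int_{-i\infty}^{i\infty} d\lambda \int_{-i\infty}^{i\infty} d\mu\; e^{K F(\lambda, \mu, x, y)}, \tag{16}$$

up to a prefactor algebraic in $K$, where

$$F(\lambda, \mu, x, y) = L(\lambda, \mu) - S(y) - \lambda y - \mu x, \tag{17}$$

$$S(y) = -\frac{1+y}{2} \log \frac{1+y}{2} - \frac{1-y}{2} \log \frac{1-y}{2}, \tag{18}$$

$$e^{L(\lambda, \mu)} = (1 - A)\, e^{\lambda + \mu} + A\, e^{\lambda - \mu} + B\, e^{-\lambda + \mu} + (1 - B)\, e^{-\lambda - \mu}; \tag{19}$$

the time evolution for the synaptic distribution is then given by the following equation:

$$p_{t+1}(x) \propto \int_{-1}^{1} dy \int_{-i\infty}^{i\infty} d\lambda \int_{-i\infty}^{i\infty} d\mu\; e^{K F(\lambda, \mu, x, y)}\, p_t(y). \tag{20}$$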



As a consequence, in the large $K$ limit the integral in (20) is dominated by the physical
saddle point: the evolution operator $W$ becomes, in the large $K$ limit, a Liouville
operator, describing a deterministic evolution. The saddle point is determined by the
equations $\partial F/\partial\lambda = 0$, $\partial F/\partial\mu = 0$,
$\partial F/\partial y = 0$. After a little algebra, it turns out that at the saddle point
the relation $x = B - A + y(1 - A - B)$ holds. Since $W_K$ is (by construction) normalized
for every $K$, the limiting kernel, as $K$ goes to infinity, will also be normalized: we
can then conclude that the limiting kernel is given by $\delta(x - \tilde{x}(y))$, where
$\tilde{x}(y) = B - A + y(1 - A - B)$.
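
The saddle-point conditions can be checked numerically. The sketch below assumes the forms $F = L - S(y) - \lambda y - \mu x$ with $e^{L} = (1-A)e^{\lambda+\mu} + Ae^{\lambda-\mu} + Be^{-\lambda+\mu} + (1-B)e^{-\lambda-\mu}$ and the binary entropy $S(y)$, and tests the candidate saddle $\lambda = \tanh^{-1} y$, $\mu = 0$, $x = B - A + y(1 - A - B)$; this candidate was worked out for this illustration and the helper names are hypothetical.

```python
import math

def L(lam, mu, A, B):
    """Logarithm of the single-synapse generating function."""
    return math.log((1 - A) * math.exp(lam + mu) + A * math.exp(lam - mu)
                    + B * math.exp(-lam + mu) + (1 - B) * math.exp(-lam - mu))

def S(y):
    """Entropy per synapse of configurations with average y."""
    p, n = (1 + y) / 2, (1 - y) / 2
    return -p * math.log(p) - n * math.log(n)

def F(lam, mu, x, y, A, B):
    return L(lam, mu, A, B) - S(y) - lam * y - mu * x

def gradient(lam, mu, x, y, A, B, h=1e-6):
    """Central-difference derivatives of F with respect to lambda, mu and y."""
    d_lam = (F(lam + h, mu, x, y, A, B) - F(lam - h, mu, x, y, A, B)) / (2 * h)
    d_mu = (F(lam, mu + h, x, y, A, B) - F(lam, mu - h, x, y, A, B)) / (2 * h)
    d_y = (F(lam, mu, x, y + h, A, B) - F(lam, mu, x, y - h, A, B)) / (2 * h)
    return d_lam, d_mu, d_y

A, B, y = 0.01, 0.04, 0.2
lam, mu = math.atanh(y), 0.0
x = B - A + y * (1 - A - B)       # value of x selected by the saddle point
value = F(lam, mu, x, y, A, B)
grad = gradient(lam, mu, x, y, A, B)
print(value, grad)
```

At this point all three derivatives vanish and $F$ itself is zero, consistent with a kernel that stays normalized as $K$ grows; for any other $x$ the condition $\partial F/\partial\mu = 0$ fails at $\mu = 0$.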
Figure Captions



Figure 1: The dashed lines represent the $x$-distributions from numerical simulations for
$K = 20$ (1), $K = 100$ (2), $K = 200$ (3), $K = 500$ (4), to be compared with the
invariant distribution of (4), here represented by the solid line. The case $q = 0.06$,
$Q = 0.8$ and $m = 0.5$ is here considered.

Figure 2: In the $\beta$-$a$ plane of parameters (see the text), the transition lines between
the ferro and paramagnetic phases are depicted, for $q = 0$ (continuous line), $q = 0.02$
(dashed line) and $q = 0.05$ (dotted line).

Figure 3: Concerning the damage spreading phenomenon, $y = d\epsilon^*/dq|_{q=0}$ is depicted
versus the variance of random fields, $B$ (see the text).